NLP

Emotional Chatting Machine: Emotional Conversation Generation with Internal and External Memory

This paper studies how to incorporate emotion into open-domain dialogue generation. The model has three main components: encoding the emotion category, capturing implicit changes of the internal emotion state, and using an explicit external emotion vocabulary. Experiments show that the proposed model (ECM) improves both the content and the emotion of the generated responses.
paper link
code link

Introduction

Earlier work on dialogue generation rarely considered emotion; most studies focus on the content of the generated response. The authors give examples illustrating the advantage of incorporating emotion:

Table 1: Conversations with/without considering emotion.

Incorporating emotion currently faces the following challenges:

  • Large-scale emotion-annotated data is scarce, and emotion categorization itself is ill-defined and highly subjective
  • It is hard to balance grammatical correctness against emotion expression
  • Emotion expression is hard to capture, since it is often implicit

The main contributions of this paper are:

  • The first work to address emotion factors in large-scale conversation generation.
  • An end-to-end model, ECM, that incorporates emotion through three mechanisms: emotion category embedding, an internal emotion memory, and an external memory.

Emotional Chatting Machine

Task Definition and Overview

The task is defined as follows: given a post $X = (x_{1},x_{2},\dots,x_{n})$ and a corresponding emotion category $e$, generate a response $Y=(y_{1},y_{2},\dots,y_{m})$.

The emotion categories in this paper are {angry, disgust, happy, like, sad, other}. In this model the emotion category of the response is specified as input, because for a given post there is no single optimal emotion category: different responders may reply with different emotions. How to choose an appropriate emotion category is beyond the scope of this paper.

Figure 1: Overview of ECM (the grey unit). The pink units are used to model emotion factors in the framework.

During training, emotion labels are first obtained from an emotion classifier; at inference time, the emotion category must be specified explicitly.

ECM consists of three main components:

  • ECM learns a distributed representation of each emotion category and feeds it into the decoder.
  • During decoding, it maintains an implicit internal emotion state that is dynamically updated, balancing the influence of grammar and emotion.
  • An external memory module explicitly chooses a word from an emotion lexicon, or generates a generic word as usual.

Emotion Category Embedding

For each emotion category $e$, a vector representation $v_{e}$ is randomly initialized and then learned during training. The resulting $v_{e}$ is fed into the decoder:
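The original equation image is not reproduced in this note; following the ECM paper's notation (a reconstruction, so details may differ slightly from the original), the decoder state update with the emotion embedding is:

$$\mathbf{s}_{t} = \mathrm{GRU}\left(\mathbf{s}_{t-1},\, [\mathbf{c}_{t};\, \mathbf{e}(y_{t-1});\, \mathbf{v}_{e}]\right)$$

where $\mathbf{c}_{t}$ is the attention context vector and $\mathbf{e}(y_{t-1})$ is the embedding of the previously generated word.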

Internal Memory

During generation the emotion embedding is fixed, which may sacrifice grammatical correctness.

Inspired by the psychological findings that emotional responses are relatively short lived and involve changes (Gross 1998; Hochschild 1979), and the dynamic emotion situation in emotional responses (Alam, Danieli, and Riccardi 2017), we design an internal memory module to capture the emotion dynamics during decoding.

There is an internal emotion state for each category before the decoding process starts; at each step the emotion state decays by a certain amount; once decoding is complete, the emotion state should have decayed to zero, indicating the emotion has been fully expressed.

Figure 2: Data flow of the decoder with an internal memory. The internal memory $M_{e,t}^{I}$ is read with the read gate $g_{t}^{r}$ by an amount $M_{r,t}^{I}$ to update the decoder's state, and the memory is updated to $M_{e,t+1}^{I}$ with the write gate $g_{t}^{w}$.

At each time step $t$, ECM computes a read gate $g_{t}^{r}$ and a write gate $g_{t}^{w}$:
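The gate equations are reconstructed here in the paper's notation ($W_{g}^{r}$ and $W_{g}^{w}$ are learned parameters; details may differ slightly from the original):

$$g_{t}^{r} = \mathrm{sigmoid}\left(W_{g}^{r}\,[\mathbf{e}(y_{t-1});\, \mathbf{s}_{t-1};\, \mathbf{c}_{t}]\right)$$

$$g_{t}^{w} = \mathrm{sigmoid}\left(W_{g}^{w}\,\mathbf{s}_{t}\right)$$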

The read gate and write gate are used to read from and write to the internal memory respectively, as follows ($\otimes$ denotes element-wise multiplication):
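Reconstructed from the paper's notation, the read and write operations are:

$$M_{r,t}^{I} = g_{t}^{r} \otimes M_{e,t}^{I}$$

$$M_{e,t+1}^{I} = g_{t}^{w} \otimes M_{e,t}^{I}$$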

The GRU then updates its state as:
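Reconstructed from the paper's notation, with the read result $M_{r,t}^{I}$ replacing the fixed emotion embedding in the decoder input:

$$\mathbf{s}_{t} = \mathrm{GRU}\left(\mathbf{s}_{t-1},\, [\mathbf{c}_{t};\, \mathbf{e}(y_{t-1});\, M_{r,t}^{I}]\right)$$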

In Eq. (9), $g_{t}^{w}$ lies between 0 and 1, so $M_{e,t}^{I}$ is repeatedly multiplied by values below 1 and the emotion state gradually decays, consistent with the assumption above.
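The decay can be illustrated numerically: repeatedly applying an element-wise write gate in (0, 1) shrinks the emotion state toward zero. This is a toy illustration only, not the trained model; the gate values below are arbitrary stand-ins for sigmoid outputs.

```python
# Toy illustration of internal-memory decay: each decoding step
# multiplies the emotion state element-wise by a write gate in (0, 1).
def decay(state, gates):
    """Apply a sequence of write gates to an emotion-state vector."""
    history = [state]
    for g in gates:
        state = [m * w for m, w in zip(state, g)]  # element-wise product
        history.append(state)
    return history

# A 2-dimensional emotion state, decayed over 5 decoding steps.
hist = decay([1.0, 1.0], [[0.5, 0.8]] * 5)
print(hist[-1])  # each dimension shrinks toward zero
```

Dimensions with smaller gate values (here 0.5 vs. 0.8) decay faster, which is how the model can express some emotion aspects quickly and others more gradually.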

External Memory

In the internal memory module, the correlation between the change of the internal emotion state and selection of a word is implicit and not directly observable.

Because different words express emotion to different degrees, the authors propose an External Memory module to model explicit emotion expression: it assigns different generation probabilities to emotion words and generic words, so at each step the model can choose to generate a word from either the emotion vocabulary or the generic vocabulary.
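A minimal sketch of this idea, under assumed names and toy logits (not the paper's implementation): a type-selector weight `g_t` scales a softmax over emotion words against a softmax over generic words, and the two scaled distributions are concatenated.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of logits."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    z = sum(exps)
    return [e / z for e in exps]

def external_memory_dist(generic_logits, emotion_logits, g_t):
    """Final distribution: (1 - g_t) * P_g over generic words
    concatenated with g_t * P_e over emotion words (disjoint vocabs)."""
    p_g = [(1 - g_t) * p for p in softmax(generic_logits)]
    p_e = [g_t * p for p in softmax(emotion_logits)]
    return p_g + p_e  # concatenation; still sums to 1

# 2 generic words, 3 emotion words, type selector leaning emotional.
dist = external_memory_dist([2.0, 0.5], [1.0, 1.0, 0.0], g_t=0.7)
print(sum(dist))  # total probability mass is 1
```

Because the two vocabularies are disjoint, the concatenation is itself a valid distribution over the union vocabulary, with exactly `g_t` of the mass on emotion words.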

Figure 3: Data flow of the decoder with an external memory. The final decoding probability is weighted between the emotion softmax and the generic softmax, where the weight is computed by the type selector.

$P(y_{t})$ is the concatenation of $P_{g}$ and $P_{e}$, since the two vocabularies share no common words.
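In the paper's notation this can be written as (a reconstruction; the type selector $g_{t}$ is a scalar computed from the decoder state):

$$P(y_{t}) = \begin{bmatrix} (1-g_{t})\,P_{g}(y_{t}) \\ g_{t}\,P_{e}(y_{t}) \end{bmatrix}$$

Since $P_{g}$ and $P_{e}$ each sum to 1 over their own vocabulary, the concatenation is a valid distribution over the union.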

Experiments

Table 4: Objective evaluation with perplexity and accuracy.

Figure 4: Sample responses generated by Seq2Seq and ECM (original Chinese and English translation; the colored words are the emotion words corresponding to the given emotion category). The corresponding posts did not appear in the training set.

Table 6: Manual evaluation of the generated responses in terms of Content (Cont.) and Emotion (Emot.).